{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Kalman Filters"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this lab you will:\n",
    "\n",
    "- Estimate Moving Average\n",
    "- Use Kalman Filters to calculate the mean and covariance of our time series\n",
    "- Modify a Pairs trading function to make use of Kalman Filters"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## What is a Kalman Filter?\n",
    "\n",
    "The Kalman filter is an algorithm that uses noisy observations of a system over time to estimate the parameters of the system (some of which are unobservable) and predict future observations. At each time step, it makes a prediction, takes in a measurement, and updates itself based on how the prediction and measurement compare.\n",
    "\n",
    "The algorithm is as follows:\n",
    "1. Take as input a mathematical model of the system, i.e.\n",
    "  * the transition matrix, which tells us how the system evolves from one state to another. For instance, if we are modeling the movement of a car, then the next values of position and velocity can be computed from the previous ones using kinematic equations. Alternatively, if we have a system which is fairly stable, we might model its evolution as a random walk. If you want to read up on Kalman filters, note that this matrix is usually called $A$.\n",
    "  * the observation matrix, which tells us the next measurement we should expect given the predicted next state. If we are measuring the position of the car, we just extract the position values stored in the state. For a more complex example, consider estimating a linear regression model for the data. Then our state is the coefficients of the model, and we can predict the next measurement from the linear equation. This is denoted $H$.\n",
    "  * any control factors that affect the state transitions but are not part of the measurements. For instance, if our car were falling, gravity would be a control factor. If the noise does not have mean 0, it should be shifted over and the offset put into the control factors. The control factors are summarized in a matrix $B$ with time-varying control vector $u_t$, which give the offset $Bu_t$.\n",
    "  * covariance matrices of the transition noise (i.e. noise in the evolution of the system) and measurement noise, denoted $Q$ and $R$, respectively.\n",
    "2. Take as input an initial estimate of the state of the system and the error of the estimate, $\\mu_0$ and $\\sigma_0$.\n",
    "3. At each timestep:\n",
    "  * estimate the current state of the system $x_t$ using the transition matrix\n",
    "  * take as input new measurements $z_t$\n",
    "  * use the conditional probability of the measurements given the state, taking into account the uncertainties of the measurement and the state estimate, to update the estimated current state of the system $x_t$ and the covariance matrix of the estimate $P_t$\n",
    "\n",
    "[This graphic](https://upload.wikimedia.org/wikipedia/commons/a/a5/Basic_concept_of_Kalman_filtering.svg) illustrates the procedure followed by the algorithm. \n",
    "\n",
    "It's very important for the algorithm to keep track of the covariances of its estimates. This way, it can give us a more nuanced result than simply a point value when we ask for it, and it can use its confidence to decide how much to be influenced by new measurements during the update process. The more certain it is of its estimate of the state, the more skeptical it will be of measurements that disagree with the state.\n",
    "\n",
    "By default, the errors are assumed to be normally distributed, and this assumption allows the algorithm to calculate precise confidence intervals. It can, however, be implemented for non-normal errors."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Install dependencies"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install pykalman"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install qq-training-wheels auquan_toolbox --upgrade"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Import a Kalman filter and other useful libraries\n",
    "from pykalman import KalmanFilter\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt\n",
    "from scipy import poly1d\n",
    "\n",
    "from backtester.dataSource.yahoo_data_source import YahooStockDataSource\n",
    "from datetime import datetime\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Toy example: falling ball\n",
    "\n",
    "Imagine we have a falling ball whose motion we are tracking with a camera. The state of the ball consists of its position and velocity. We know that we have the relationship $x_t = x_{t-1} + v_{t-1}\\tau - \\frac{1}{2} g \\tau^2$, where $\\tau$ is the time (in seconds) elapsed between $t-1$ and $t$ and $g$ is gravitational acceleration. Meanwhile, our camera can tell us the position of the ball every second, but we know from the manufacturer that the camera accuracy, translated into the position of the ball, implies variance in the position estimate of about 3 meters.\n",
    "\n",
    "In order to use a Kalman filter, we need to give it transition and observation matrices, transition and observation covariance matrices, and the initial state. The state of the system is (position, velocity), so it follows the transition matrix\n",
    "$$ \\left( \\begin{array}{cc}\n",
    "1 & \\tau \\\\\n",
    "0 & 1 \\end{array} \\right) $$\n",
    "\n",
    "with offset $(-\\tau^2 \\cdot g/2, -\\tau\\cdot g)$. The observation matrix just extracts the position coordinate, (1  0), since we are measuring position. We know that the observation variance is 1, and transition covariance is 0 since we will be simulating the data the same way we specified our model. For the initial state, let's feed our model something bogus like (30, 10) and see how our system evolves."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "tau = 0.1\n",
    "\n",
    "# Set up the filter\n",
    "kf = KalmanFilter(n_dim_obs=1, n_dim_state=2, # position is 1-dimensional, (x,v) is 2-dimensional\n",
    "                  initial_state_mean=[30,10],\n",
    "                  initial_state_covariance=np.eye(2),\n",
    "                  transition_matrices=[[1,tau], [0,1]],\n",
    "                  observation_matrices=[[1,0]],\n",
    "                  observation_covariance=3,\n",
    "                  transition_covariance=np.zeros((2,2)),\n",
    "                  transition_offsets=[-4.9*tau**2, -9.8*tau])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create a simulation of a ball falling for 40 units of time (each of length tau)\n",
    "times = np.arange(40)\n",
    "actual = -4.9*tau**2*times**2\n",
    "\n",
    "# Simulate the noisy camera data\n",
    "sim = actual + 3*np.random.randn(40)\n",
    "\n",
    "# Run filter on camera data\n",
    "state_means, state_covs = kf.filter(sim)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.figure(figsize=(15,7))\n",
    "plt.plot(times, state_means[:,0])\n",
    "plt.plot(times, sim)\n",
    "plt.plot(times, actual)\n",
    "plt.legend(['Filter estimate', 'Camera data', 'Actual'])\n",
    "plt.xlabel('Time')\n",
    "plt.ylabel('Height');"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "At each point in time we plot the state estimate <i>after</i> accounting for the most recent measurement, which is why we are not at position 30 at time 0. The filter's attentiveness to the measurements allows it to correct for the initial bogus state we gave it. Then, by weighing its model and knowledge of the physical laws against new measurements, it is able to filter out much of the noise in the camera data. Meanwhile the confidence in the estimate increases with time, as shown by the graph below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Plot variances of x and v, extracting the appropriate values from the covariance matrix\n",
    "plt.figure(figsize=(15,7))\n",
    "plt.plot(times, state_covs[:,0,0])\n",
    "plt.plot(times, state_covs[:,1,1])\n",
    "plt.legend(['Var(x)', 'Var(v)'])\n",
    "plt.ylabel('Variance')\n",
    "plt.xlabel('Time');"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Kalman filter can also do <i>smoothing</i>, which takes in all of the input data at once and then constructs its best guess for the state of the system in each period post factum. That is, it does not provide online, running estimates, but instead uses all of the data to estimate the historical state, which is useful if we only want to use the data after we have collected all of it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Use smoothing to estimate what the state of the system has been\n",
    "smoothed_state_means, _ = kf.smooth(sim)\n",
    "\n",
    "# Plot results\n",
    "plt.figure(figsize=(15,7))\n",
    "plt.plot(times, smoothed_state_means[:,0])\n",
    "plt.plot(times, sim)\n",
    "plt.plot(times, actual)\n",
    "plt.legend(['Smoothed estimate', 'Camera data', 'Actual'])\n",
    "plt.xlabel('Time')\n",
    "plt.ylabel('Height');"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Example: Estimating Moving Average\n",
    "\n",
    "Because the Kalman filter updates its estimates at every time step and tends to weigh recent observations more than older ones, it can be used to estimate rolling parameters of the data. When using a Kalman filter, there's no window length that we need to specify. This is useful for computing the moving average or for smoothing out estimates of other quantities.\n",
    "\n",
    "Below, we'll use both a Kalman filter and an n-day moving average to estimate the rolling mean of a dataset. We construct the inputs to the Kalman filter as follows:\n",
    "\n",
    "* The mean is the model's guess for the mean of the distribution from which measurements are drawn. This means our prediction of the next value is equal to our estimate of the mean. \n",
    "* Hopefully the mean describes our observations well, hence it shouldn't change significantly when we add an observation. This implies we can assume that it evolves as a random walk with a small error term. We set the transition matrix to 1 and transition covariance matrix is a small number.\n",
    "* We assume that the observations have variance 1 around the rolling mean (1 is chosen randomly). \n",
    "* Our initial guess for the mean is 0, but the filter realizes that that is incorrect and adjusts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pykalman import KalmanFilter\n",
    "from backtester.dataSource.yahoo_data_source import YahooStockDataSource\n",
    "\n",
    "# Load pricing data for a security\n",
    "startDateStr = '2012/12/31'\n",
    "endDateStr = '2017/12/31'\n",
    "cachedFolderName = './yahooData/'\n",
    "dataSetId = 'testPairsTrading'\n",
    "instrumentIds = ['SPY','MSFT','ADBE']\n",
    "ds = YahooStockDataSource(cachedFolderName=cachedFolderName,\n",
    "                            dataSetId=dataSetId,\n",
    "                            instrumentIds=instrumentIds,\n",
    "                            startDateStr=startDateStr,\n",
    "                            endDateStr=endDateStr,\n",
    "                            event='history')\n",
    "\n",
    "# Get adjusted closing price\n",
    "data = ds.getBookDataByFeature()['adjClose']\n",
    "\n",
    "# Data for Adobe\n",
    "S1 = data['ADBE']\n",
    "# Data for Microsoft\n",
    "S2 = data['MSFT']\n",
    "\n",
    "# Take ratio of the adjusted closing prices\n",
    "x = S1/S2\n",
    "\n",
    "# Construct a Kalman filter\n",
    "kf = KalmanFilter(transition_matrices = [1],\n",
    "                  observation_matrices = [1],\n",
    "                  initial_state_mean = 0,\n",
    "                  initial_state_covariance = 1,\n",
    "                  observation_covariance=1,\n",
    "                  transition_covariance=.01)\n",
    "\n",
    "# Use the observed values of the price to get a rolling mean\n",
    "state_means, _ = kf.filter(x.values)\n",
    "state_means = pd.Series(state_means.flatten(), index=x.index)\n",
    "\n",
    "# Compute the rolling mean with various lookback windows\n",
    "mean30 = x.rolling(window = 10).mean()\n",
    "mean60 = x.rolling(window = 30).mean()\n",
    "mean90 = x.rolling(window = 60).mean()\n",
    "\n",
    "# Plot original data and estimated mean\n",
    "plt.figure(figsize=(15,7))\n",
    "plt.plot(state_means[60:], '-b', lw=2, )\n",
    "plt.plot(x[60:],'-g',lw=1.5)\n",
    "plt.plot(mean30[60:], 'm', lw=1)\n",
    "plt.plot(mean60[60:], 'y', lw=1)\n",
    "plt.plot(mean90[60:], 'c', lw=1)\n",
    "plt.title('Kalman filter estimate of average')\n",
    "plt.legend(['Kalman Estimate', 'X', '30-day Moving Average', '60-day Moving Average','90-day Moving Average'])\n",
    "plt.xlabel('Day')\n",
    "plt.ylabel('Price');"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Observations\n",
    "\n",
    "As you can see, the estimate from Kalman Filter is usually somewhere between day 30 and day 60 moving average. This could be because the Filter updates its knowledge of the world based on the most recent data. The advantage of the Kalman filter is that we don't need to select a window length. It makes predictions based on the underlying model (that we set parameters for) and the data itself. We do open ourselves up to overfitting with some of the initialization parameters for the filter, but those are slightly easier to objectively define. There's no free lunch and we can't eliminate overfitting, but a Kalman Filter is more rigorous than a moving average and generally better."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Another interesting application of Kalman Filters, Beta Estimation for Linear Regression can be found here [Dr. Aidan O'Mahony's blog.](http://www.thealgoengineer.com/2014/online_linear_regression_kalman_filter/)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We'll be using Kalman filters for Pairs trading the subsequent notebook. Make sure you try to run the examples given here with various hyperparameters for the underlying Kalman filter model to get comfortable with the same and developing a better understanding in the process. For example you can try out the following:\n",
    "1. Use multi dimensional transition matrices so as to use more of past information for making predictions at each point\n",
    "2. Try different values of observation and transition covariance"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example: Pairs Trading\n",
    "\n",
    "In the previous notebook we made use of 60 day window for calculating mean and standard deviation of our time series. Now we'll be replacing that with Kalman filters"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Let's get the same data that we used in the previous notebook"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "startDateStr = '2007/12/01'\n",
    "endDateStr = '2017/12/01'\n",
    "cachedFolderName = 'yahooData/'\n",
    "dataSetId = 'testPairsTrading2'\n",
    "instrumentIds = ['ADBE','MSFT']\n",
    "ds = YahooStockDataSource(cachedFolderName=cachedFolderName,\n",
    "                            dataSetId=dataSetId,\n",
    "                            instrumentIds=instrumentIds,\n",
    "                            startDateStr=startDateStr,\n",
    "                            endDateStr=endDateStr,\n",
    "                            event='history')\n",
    "data = ds.getBookDataByFeature()['adjClose']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### A quick visualization of error and standard deviations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "S1, S2 = data['ADBE'].iloc[:1762], data['MSFT'].iloc[:1762]\n",
    "ratios = S1/S2\n",
    "\n",
    "kf = KalmanFilter(transition_matrices = [1],\n",
    "              observation_matrices = [1],\n",
    "              initial_state_mean = 0,\n",
    "              initial_state_covariance = 1,\n",
    "              observation_covariance=1,\n",
    "              transition_covariance=.0001)\n",
    "\n",
    "state_means, state_cov = kf.filter(ratios.values)\n",
    "state_means, state_std = state_means.squeeze(), np.std(state_cov.squeeze())\n",
    "\n",
    "plt.figure(figsize=(15,7))\n",
    "plt.plot(ratios.values - state_means, 'm', lw=1)\n",
    "plt.plot(np.sqrt(state_cov.squeeze()), 'y', lw=1)\n",
    "plt.plot(-np.sqrt(state_cov.squeeze()), 'c', lw=1)\n",
    "plt.title('Kalman filter estimate')\n",
    "plt.legend(['Error: real_value - mean', 'std', '-std'])\n",
    "plt.xlabel('Day')\n",
    "plt.ylabel('Value');\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We'll be using the z score in the same way as before. Our strategy is to go long or short only in the areas where the |error| is greater than one standard deviation. Since 1 day price could be noisy, we'll be using 5 day average for a particular day's price"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Let's modify our trading function to make use of Kalman Filter while keeping the same logic for carrying out trades"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def trade(S1, S2):\n",
    "    # Compute rolling mean and rolling standard deviation\n",
    "    ratios = S1/S2\n",
    "    \n",
    "    kf = KalmanFilter(transition_matrices = [1],\n",
    "                  observation_matrices = [1],\n",
    "                  initial_state_mean = 0,\n",
    "                  initial_state_covariance = 1,\n",
    "                  observation_covariance=1,\n",
    "                  transition_covariance=.001)\n",
    "    \n",
    "    state_means, state_cov = kf.filter(ratios.values)\n",
    "    state_means, state_std = state_means.squeeze(), np.std(state_cov.squeeze())\n",
    "    \n",
    "    window = 5\n",
    "    ma = ratios.rolling(window=window,\n",
    "                               center=False).mean()\n",
    "    zscore = (ma - state_means)/state_std\n",
    "    \n",
    "    # Simulate trading\n",
    "    # Start with no money and no positions\n",
    "    money = 0\n",
    "    countS1 = 0\n",
    "    countS2 = 0\n",
    "    for i in range(len(ratios)):\n",
    "        # Sell short if the z-score is > 1\n",
    "        if zscore[i] > 1:\n",
    "            money += S1[i] - S2[i] * ratios[i]\n",
    "            countS1 -= 1\n",
    "            countS2 += ratios[i]\n",
    "        # Buy long if the z-score is < 1\n",
    "        elif zscore[i] < -1:\n",
    "            money -= S1[i] - S2[i] * ratios[i]\n",
    "            countS1 += 1\n",
    "            countS2 -= ratios[i]\n",
    "        # Clear positions if the z-score between -.5 and .5\n",
    "        elif abs(zscore[i]) < 0.5:\n",
    "            money += countS1*S1[i] + S2[i] * countS2\n",
    "            countS1 = 0\n",
    "            countS2 = 0\n",
    "#         print('Z-score: '+ str(zscore[i]), countS1, countS2, S1[i] , S2[i])\n",
    "    return money\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "trade(data['ADBE'].iloc[:1762], data['MSFT'].iloc[:1762])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The strategy is still profitable! You can try changing the hyperparameters of the Kalman Filter and see how it affects the PnL. The results might not be always better than the mean over moving window. You can try this with other instruments as well."
   ]
  }
 ],
 "metadata": {
  "environment": {
   "name": "tf2-2-2-gpu.2-2.m50",
   "type": "gcloud",
   "uri": "gcr.io/deeplearning-platform-release/tf2-2-2-gpu.2-2:m50"
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
